Concept-Based Topic Model Improvement
Authors
Abstract
We propose a system that uses conceptual knowledge to improve topic models by removing unrelated words from the simplified topic description. We use WordNet to detect which topical words are not conceptually similar to the others, and then test our assumptions against human judgment. Results obtained on two different corpora under different test conditions show that the words detected as unrelated were much more likely than the others to be chosen by human evaluators as not belonging to the topic at all. We show that this probability correlates strongly with an automatically calculated topical fitness, and we discuss how the correlation varies with the method and data used.
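The idea of scoring each topic word by its conceptual similarity to the others can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or how topical fitness is computed, so the code below assumes a simple path-based similarity and defines fitness as the mean similarity of a word to the other topic words. The `PARENT` taxonomy is a hypothetical toy stand-in for WordNet hypernym links, and the names `path_similarity`, `topical_fitness`, and `least_fitting` are illustrative only.

```python
from statistics import mean

# Hypothetical toy taxonomy (child -> parent), standing in for WordNet hypernyms.
PARENT = {
    "dog": "mammal", "cat": "mammal", "horse": "mammal",
    "mammal": "animal", "animal": "entity",
    "invoice": "document", "document": "entity",
}


def ancestors(word):
    """Return the chain from `word` up to the root, starting with `word`."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def path_similarity(a, b):
    """Assumed measure: 1 / (1 + edges on the path through the nearest common ancestor)."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {node: i for i, node in enumerate(path_b)}
    for i, node in enumerate(path_a):
        if node in depth_in_b:
            return 1.0 / (1 + i + depth_in_b[node])
    return 0.0  # no common ancestor in the taxonomy


def topical_fitness(words):
    """Assumed definition: mean similarity of each word to the other topic words."""
    return {
        w: mean(path_similarity(w, other) for other in words if other != w)
        for w in words
    }


def least_fitting(words):
    """Return the word with the lowest fitness, i.e. the candidate unrelated word."""
    fitness = topical_fitness(words)
    return min(fitness, key=fitness.get)


topic = ["dog", "cat", "horse", "invoice"]
scores = topical_fitness(topic)
outlier = least_fitting(topic)
```

With this toy taxonomy, `invoice` receives the lowest score because it shares only the root concept with the other three words. A topic needs at least two words for the mean to be defined. In a real setting the similarity function would be replaced by a WordNet-based measure, and the fitness scores would be compared against human judgments, as the abstract describes.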
Similar resources
Vector model improvement by FCA and Topic Evolution
Presented research is based on standard methods of information retrieval using the vector model for representation of documents (objects). The vector model is often expanded to get better precision and recall. In this article we have mentioned two approaches of vector model expansion. The first approach is based on hierarchical clustering. Its goal is to find a list of all documents they have m...
Use of Concept Map as a reinforcement tool in Undergraduate Curriculum: An analytical study
Introduction: Ever-expanding medical literature demands successful amalgamation of huge information and clinical practice for budding doctors. This study aimed to find the effectiveness of the concept map, a novel method of teaching to improve performance among undergraduate pharmacology students. Methods: The undergraduate medical students pursuing pharmacology in...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
Graph-based Visual Saliency Model using Background Color
Visual saliency is a cognitive psychology concept that makes some stimuli of a scene stand out relative to their neighbors and attract our attention. Computing visual saliency is a topic of recent interest. Here, we propose a graph-based method for saliency detection, which contains three stages: pre-processing, initial saliency detection and final saliency detection. The initial saliency map i...
Studying Dynamic behavior of Distributed Parameter Processes Behavior Based on Dominant Gain Concept and it’s Use in Controlling these Processes
In this paper, distributed parameter process systems behavior is studied in frequency domain. Based on the dominant gain concept that is developed for such studies, a method is presented to control distributed parameter process systems. By using dominant gain concept, the location of open loop zeros, resulted from the time delay parameter in the process model, were changed from the right half p...